multiplication problem
Chain-of-Thought Tokens are Computer Program Variables
Zhu, Fangwei, Wang, Peiyi, Sui, Zhifang
Chain-of-thought (CoT) prompting requires large language models (LLMs) to generate intermediate steps before reaching the final answer, and has proven effective in helping LLMs solve complex reasoning tasks. However, the inner mechanism of CoT remains largely unclear. In this paper, we empirically study the role of CoT tokens in LLMs on two compositional tasks: multi-digit multiplication and dynamic programming. While CoT is essential for solving these problems, we find that preserving only the tokens that store intermediate results achieves comparable performance. Furthermore, we observe that storing intermediate results in an alternative latent form does not affect model performance. We also randomly intervene on some values in the CoT, and observe that subsequent CoT tokens and the final answer change correspondingly. These findings suggest that CoT tokens may function like variables in computer programs, but with potential drawbacks such as unintended shortcuts and computational complexity limits between tokens. The code and data are available at https://github.com/solitaryzero/CoTs_are_Variables.
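To make the "intermediate results as variables" idea concrete, here is a minimal Python sketch of schoolbook multiplication that writes out each partial product and running sum as a text step. The function name `multiply_with_steps` and the step format are assumptions made for illustration; they are not the prompt format or code used in the paper, and the sketch assumes non-negative integers.

```python
def multiply_with_steps(a: int, b: int) -> tuple[list[str], int]:
    """Schoolbook multiplication that records every intermediate value.

    Hypothetical illustration only: each recorded step plays the role of a
    "variable" that later steps read from. Assumes a >= 0 and b >= 0.
    """
    steps = []
    total = 0
    # Walk the digits of b from least to most significant.
    for position, digit_char in enumerate(reversed(str(b))):
        digit = int(digit_char)
        partial = a * digit * 10 ** position  # one partial product
        total += partial                      # running sum carried forward
        steps.append(
            f"{a} * {digit} * 10^{position} = {partial}; running sum = {total}"
        )
    return steps, total


steps, answer = multiply_with_steps(123, 45)
for line in steps:
    print(line)
print("answer:", answer)
```

Changing one recorded partial product would change every later running sum and the final answer, which mirrors the kind of intervention the abstract describes.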
Language Models Do Hard Arithmetic Tasks Easily and Hardly Do Easy Arithmetic Tasks
Gambardella, Andrew, Iwasawa, Yusuke, Matsuo, Yutaka
The ability (and inability) of large language models (LLMs) to perform arithmetic tasks has been the subject of much theoretical and practical debate. We show that LLMs are frequently able to correctly and confidently predict the first digit of n-digit by m-digit multiplication tasks without using chain-of-thought reasoning, even though these tasks require compounding operations to solve. Simultaneously, LLMs in practice often fail to correctly or confidently predict the last digit of an n-digit by m-digit multiplication, a task equivalent to 1-digit by 1-digit multiplication, which can be easily learned or memorized. We show that the latter task can be solved more robustly when the LLM is conditioned on all of the correct higher-order digits, which on average increases the confidence in the correct last digit on 5-digit by 5-digit multiplication tasks using Llama 2-13B by over 230% (0.13 to 0.43) and Mistral-7B by 150% (0.22 to 0.55).
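The two arithmetic facts in this abstract can be checked directly. The sketch below (helper names are hypothetical, chosen for illustration) verifies that the last digit of a product depends only on the last digits of the factors, and recomputes the relative increases from the reported confidence values.

```python
import random


def last_digit_of_product(a: int, b: int) -> int:
    """Last digit of a * b, computed from the last digits of a and b only."""
    return ((a % 10) * (b % 10)) % 10


def relative_increase_percent(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Check the last-digit identity on random 5-digit by 5-digit products.
rng = random.Random(0)
for _ in range(1000):
    a = rng.randint(10_000, 99_999)
    b = rng.randint(10_000, 99_999)
    assert last_digit_of_product(a, b) == (a * b) % 10

# Recompute the percentages from the confidence values quoted in the abstract.
print(relative_increase_percent(0.13, 0.43))  # Llama 2-13B figures
print(relative_increase_percent(0.22, 0.55))  # Mistral-7B figures
```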
Solving the multiplication problem of a large language model system using a graph-based method
Tuncer, Turker, Dogan, Sengul, Baygin, Mehmet, Barua, Prabal Datta, Hafeez-Baig, Abdul, Tan, Ru-San, Chakraborty, Subrata, Acharya, U. Rajendra
The generative pre-trained transformer (GPT)-based chatbot software ChatGPT possesses excellent natural language processing capabilities but is inadequate for solving arithmetic problems, especially multiplication. Its GPT structure uses a computational graph for multiplication, which has limited accuracy beyond simple multiplication operations. We developed a graph-based multiplication algorithm that emulated human-like numerical operations by incorporating a 10^k operator, where k represents the maximum power of base 10 of the larger of the two input numbers. Our proposed algorithm attained 100% accuracy for 1,000,000 large-number multiplication tasks, effectively solving the multiplication challenge of GPT-based and other large language models. Our work highlights the importance of blending simple human insights into the design of artificial intelligence algorithms. Keywords: Graph-based multiplication; ChatGPT; Multiplication problem
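The abstract does not spell out the algorithm, so the following is only a rough sketch of one way a power-of-ten split could reduce a large multiplication to single-digit products. It assumes k is the exponent of the leading digit of the larger operand; the function `split_multiply` is hypothetical and should not be read as the authors' graph-based method.

```python
def split_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers by repeatedly splitting the larger
    operand at its highest power of ten.

    Illustrative assumption, not the paper's algorithm.
    """
    if a < b:
        a, b = b, a                      # make `a` the larger operand
    if a < 10:
        return a * b                     # both operands are single digits
    k = len(str(a)) - 1                  # highest power of 10 in `a`
    lead, rest = divmod(a, 10 ** k)      # a == lead * 10**k + rest
    # Distribute: a * b == (lead * b) * 10**k + rest * b
    return split_multiply(lead, b) * 10 ** k + split_multiply(rest, b)


print(split_multiply(12345, 6789) == 12345 * 6789)
```

Because Python integers have arbitrary precision, every step is exact; the recursion only ever multiplies single-digit values and shifts by powers of ten.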
Did GoogleAI Just Snooker One of Silicon Valley's Sharpest Minds?
In 1904, the horse du jour was Clever Hans, widely reputed to be so much smarter than his brethren that he could do math, tell time, and even read and spell. News spread fast by word of mouth, and eventually the occasionally gullible New York Times reported that Hans was so smart that he "can do almost everything but talk". Ask Hans what 12 plus 13 is, and he would stamp his feet 25 times. People were amazed, and paid good money to see him. It turns out the horse knew no math; it had solved the arithmetic problems, all of them, in a different way.